Parsing Chinese With an Almost-Context-Free Grammar
نویسندگان
چکیده
We describe a novel parsing strategy we are employing for Chinese. We believe progress in Chinese parsing technology has been slowed by the excessive ambiguity that typically arises in pure context-free grammars. This problem has inspired a modified formalism that enhances our ability to write and maintain robust large grammars, by constraining productions with left/right contexts and/or nonterminal functions. Parsing is somewhat more expensive than for pure context-free parsing, but is still efficient by both theoretical and empirical analyses. Encouraging experimental results with our current grammar are described.
منابع مشابه
Corpus-oriented Acquisition of Chinese Grammar
The acquisition of grammar from a corpus is a challenging task in the preparation of a knowledge bank. In this paper, we discuss the extraction of Chinese grammar oriented to a restricted corpus. First, probabilistic context-free grammars (PCFG) are extracted automatically from the Penn Chinese Treebank and are regarded as the baseline rules. Then a corpusoriented grammar is developed by adding...
متن کاملImproving Part-of-speech Tagging for Context-free Parsing
In this paper, we propose a factored parsing model consisting of a lexical and a constituent model. The discriminative lexical model allows the parser to utilize rich contextual features beyond those encoded in the context-free grammar (CFG) in use. Experiment results reveal that our parser achieves statistically significant improvement in both parsing and tagging accuracy on both English and C...
متن کاملPCFG Parsing for Restricted Classical Chinese Texts
The Probabilistic Context-Free Grammar (PCFG) model is widely used for parsing natural languages, including Modern Chinese. But for Classical Chinese, the computer processing is just commencing. Our previous study on the part-of-speech (POS) tagging of Classical Chinese is a pioneering work in this area. Now in this paper, we move on to the PCFG parsing of Classical Chinese texts. We continue t...
متن کاملChinese Syntactic Parsing Based on Extended GLR Parsing Algorithm with PCFG*
This paper presents an extended GLR parsing algorithm with grammar PCFG* that is based on Tomita’s GLR parsing algorithm and extends it further. We also define a new grammar—PCFG* that is based on PCFG and assigns not only probability but also frequency associated with each rule. So our syntactic parsing system is implemented based on rule-based approach and statistics approach. Furthermore our...
متن کاملParsing Extended Constraint Synchronous Grammar in Chinese-Portuguese Machine Translation
This paper presents a design model for parsing an extended version of Constraint Synchronous Grammar (CSG) for Chinese-Portuguese MT, which was initially applied in the reverse way. CSG is a generalization of context free grammars that describes syntactic structures of two languages simultaneously based on the defined constraints. Extensions include the addition of part of speech information in...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1996